Abstract

Quantum computer algorithms can exploit the structure of random satisfiability problems. 
This paper extends a previous empirical evaluation of such an algorithm and gives an approx- 
imate asymptotic analysis accounting for both the average and variation of amplitudes among 
search states with the same costs. The analysis predicts good performance, on average, for a 
variety of problems including those near a phase transition associated with a high concentra- 
tion of hard cases. Based on empirical evaluation for small problems, modifying the algorithm 
in light of this analysis improves its performance. The algorithm improves on both GSAT, a 
commonly used conventional heuristic, and quantum algorithms ignoring problem structure. 

1 Introduction 

Peter Shor's polynomial-time factoring algorithm [?] showed that quantum computers [?], [12], [15], [?] can
efficiently solve an important problem thought to require exponential time on our current, "classical",
machines. Can quantum computers significantly improve other apparently intractable problems?
At first sight, combinatorial searches, such as arise in scheduling, theorem proving, cryptography,
genetics and statistical physics, are one possibility. This is because many such searches are
"nondeterministic polynomial" (NP) problems [?], which have a rapid test of whether a candidate solution
is in fact a solution and an exponential growth in the number of candidates with the size of the problem.
Quantum computers can test all candidates in superposition with about as many operations as
a classical machine uses to test just one, suggesting large improvements are possible. Unfortunately,
the difficulty of extracting a solution from the superposition appears to preclude rapid solution
of at least some NP problems [?].

Nevertheless, quantum computers may offer substantial improvement for typical searches encountered
in practice. For instance, constraint satisfaction problems [?] consist of constraints on
the values various combinations of variables can take. A candidate solution for such problems can
not only be evaluated in terms of whether it satisfies all the constraints, but also in terms of how
many constraints it violates. This additional information is often a useful guide to finding solutions,
providing the basis for conventional heuristic searches. Such heuristics are substantially better than
simpler techniques ignoring problem structure. For heuristics consisting of repeated independent
trials, Grover's amplitude amplification [22] gives a quadratic speedup with quantum computers [?],
the best possible improvement for quantum methods based only on the test of whether candidates
are solutions [?].

Any possibility of greater speedup requires a quantum algorithm using additional problem properties.
For some small or relatively easy problems such algorithms perform well [?], [49]. More generally,
quantum methods readily exploit precise information on states' distances to a solution [?],
but such information is not readily available for hard searches. Thus an important question is
whether, and to what extent, quantum computers can exploit readily computed properties of hard
search problems. In particular, can they perform significantly more efficiently than classical heuristic
methods?

This paper discusses a previous structured quantum search algorithm [27], based on evaluating, in
superposition, the number of conflicts in all search states. The paper extends empirical evaluation of
the algorithm's behavior for a class of hard search problems and compares it to a version of amplitude
amplification not requiring prior knowledge of the number of solutions [?]. Furthermore, the paper
gives an approximate asymptotic performance analysis that includes variation among amplitudes 
associated with states with the same number of conflicts. Specifically, the next two sections describe 
a class of hard search problems and the form of the quantum algorithm. The remainder of the 
paper presents the extended asymptotic analysis and compares with actual behavior based on small 
problem sizes feasible to evaluate via simulation on conventional machines. 

As a note on notation, to compare the growth rates of various functions we use [?] $f = O(g)$
to indicate that $f$ grows no faster than $g$ as a function of $n$ when $n \to \infty$. Conversely, $f = \Omega(g)$
means $f$ grows at least as fast as $g$, and $f = \Theta(g)$ means both functions grow at the same rate.



2 An Ensemble of Hard Satisfiability Problems 

Heuristics are often too complicated to allow exact analytical evaluation of their performance. In- 
stead, they are usually evaluated empirically on a sample of problems. Such a test requires a hard 
problem ensemble, i.e., a class of problem instances and associated probability distribution for their 
selection with a high concentration of hard cases. For practical use, instances of the ensemble should 
be computationally easy to generate. Since typical instances of NP problems are often much easier 
than worst case analyses suggest, defining such ensembles is not trivial. Fortunately, such ensembles 
exist for a variety of NP-complete search problems [?], [?], [29]. Significantly, problems from such
ensembles, associated with abrupt "phase transitions" in behavior, are particularly difficult for a 
variety of heuristics, on average. They thus provide good test cases. 

The $k$-satisfiability ($k$-SAT) problem provides one example. It consists of $n$ Boolean variables
and $m$ clauses. A clause is a logical OR of $k$ variables, each of which may be negated. A solution is
an assignment, i.e., a value, true or false, for each variable, satisfying all the clauses. An assignment
is said to conflict with any clause it doesn't satisfy. An example 2-SAT problem instance with 3
variables and 2 clauses is $v_1$ OR (NOT $v_2$) and $v_2$ OR $v_3$, which has 4 solutions, e.g., $v_1$ = false,
$v_2$ = false and $v_3$ = true. For $k \ge 3$, $k$-SAT is NP-complete [?], i.e., is among the most difficult NP
problems.

For assignments r and s, which can be viewed as bit-vectors of length n, let d(r, s) be the 
Hamming distance between them, i.e., the number of variables they assign different values. Let c(s) 
denote the number of the m clauses conflicting with s, which depends on the particular problem 
instance considered. The quantity c(s) can also be thought of as the cost associated with the 
assignment, and those with zero cost are solutions. 
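
As a concrete illustration of these definitions, the following minimal Python sketch evaluates $c(s)$ and $d(r, s)$ for the 2-SAT example above and enumerates its solutions by brute force. The clause encoding (signed integers, with `-i` standing for NOT $v_i$) and all function names are assumptions made for this sketch, not notation from the paper.

```python
from itertools import product

# Assumed encoding: literal +i means variable v_i, -i means NOT v_i (variables numbered from 1).
EXAMPLE_CLAUSES = [(1, -2), (2, 3)]  # (v1 OR NOT v2) AND (v2 OR v3)


def conflicts(assignment, clauses):
    """Return c(s): the number of clauses the assignment fails to satisfy."""
    def satisfied(clause):
        return any(assignment[abs(lit) - 1] == (lit > 0) for lit in clause)
    return sum(1 for clause in clauses if not satisfied(clause))


def hamming(r, s):
    """Return d(r, s): the number of variables given different values by r and s."""
    return sum(1 for a, b in zip(r, s) if a != b)


def solutions(n, clauses):
    """Enumerate all zero-cost assignments by brute force (feasible only for small n)."""
    return [s for s in product([False, True], repeat=n) if conflicts(s, clauses) == 0]


if __name__ == "__main__":
    sols = solutions(3, EXAMPLE_CLAUSES)
    print("number of solutions:", len(sols))
    print("conflicts of (false, true, false):", conflicts((False, True, False), EXAMPLE_CLAUSES))
```

The brute-force enumeration costs $2^n$ evaluations, so it serves only to check small examples such as this one.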

The random $k$-SAT ensemble with given $n$ and $m$ consists of instances whose $m$ clauses are
selected uniformly at random. Specifically, for each clause, a set of $k$ variables is selected randomly
from among the $\binom{n}{k}$ possibilities. Then each of the selected variables is negated with probability
1/2 to produce the clause. Thus each of the $m$ clauses is selected, with replacement, uniformly
from among the $N_{\mathrm{clauses}} = \binom{n}{k} 2^k$ possible clauses. The difficulty of solving such randomly generated
problems varies greatly from one instance to the next. This ensemble has a high concentration of
hard instances when $\mu = m/n$ is near a phase transition in search difficulty [?], [36], [?]. At this
transition, the fraction of soluble instances drops abruptly from near 1 to near 0. For random 3-SAT
this transition is at about $\mu = 4.25$, the value used for the results presented here as well as extensive
prior studies of classical heuristics for SAT. For soluble problems near the transition, the number of
solutions $S$ is exponentially large but a tiny fraction of all states, i.e., $S/2^n$ is exponentially small.
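
The sampling procedure just described can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: clauses are encoded as tuples of signed integers (`-i` for a negated variable $i$, variables numbered from 1), and the names `random_ksat` and `num_possible_clauses` are hypothetical.

```python
import random
from math import comb


def random_ksat(n, m, k=3, rng=None):
    """Draw m clauses independently (with replacement) from the random k-SAT ensemble."""
    rng = rng or random.Random()
    clauses = []
    for _ in range(m):
        variables = rng.sample(range(1, n + 1), k)  # k distinct variables, uniformly chosen
        # Negate each selected variable independently with probability 1/2.
        literals = [v if rng.random() < 0.5 else -v for v in variables]
        clauses.append(tuple(sorted(literals, key=abs)))
    return clauses


def num_possible_clauses(n, k):
    """N_clauses = C(n, k) * 2^k, the number of distinct clauses."""
    return comb(n, k) * 2 ** k


if __name__ == "__main__":
    n = 20
    m = int(4.25 * n)  # clause-to-variable ratio near the 3-SAT transition
    instance = random_ksat(n, m, k=3, rng=random.Random(0))
    print(len(instance), "clauses drawn from", num_possible_clauses(n, 3), "possibilities")
```

Because clauses are drawn with replacement, an instance may contain repeated clauses, as in the ensemble definition.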



3 The Algorithm 

The quantum algorithm examined here [27] has the same general form as amplitude amplification [22]
but with amplitude phase adjustments based on the state costs and the problem ensemble parameters,
i.e., $n$, $k$ and $m$ for random $k$-SAT. Importantly, the algorithm does not require characteristics
of the problem instance that are costly to compute, e.g., the number of solutions.

The overall algorithm consists of a series of trials, each operating with superpositions of all $2^n$
assignments. Superpositions correspond to vectors with an amplitude for each assignment. After 
each trial, a measurement produces a single assignment. Trials repeat until a solution is found. 
Quantum coherence need persist only for the duration of each trial, rather than over all trials. For 
the case considered here, this duration grows linearly with n thereby placing less stringent coherence 
requirements on the hardware than amplitude amplification whose trial duration grows exponentially 
with n for hard problems (because, for hard problems, the number of solutions is an exponentially 
small fraction of the total number of states) . 

A trial performs a series of $j$ steps. Each step evaluates the costs associated with all assignments
and mixes amplitudes among them based on their Hamming distances. Starting with an equal
superposition of all $2^n$ assignments, i.e., $\psi^{(0)}_s = 2^{-n/2}$, the superposition vector $\psi^{(j)}$ after $j$ steps is

$$\psi^{(j)} = U^{(j)} P^{(j)} \cdots U^{(1)} P^{(1)} \psi^{(0)} \qquad (1)$$

The algorithm involves two types of matrices: the diagonal phase adjustments $P^{(h)}$, depending on
the particular problem instance, and the matrix $U^{(h)}$, mixing amplitudes among states without
regard to the particular instance.

Specifically, $P^{(h)}$ is diagonal with $P^{(h)}_{ss} = e^{i\pi\rho(h, c(s))}$ where $c(s)$ is the number of conflicts in
assignment $s$ and $\rho$ is an arbitrary computationally-efficient real-valued function. Since $c(s)$ itself is
efficiently evaluated (by comparing the state with each of the $m$ clauses) and has only $m + 1 = \Theta(n)$
possible values, $0, \ldots, m$, quantum computers efficiently implement this matrix operation [?] as a
generalization of the technique used for amplitude amplification.

Viewing assignments as strings of $n$ bits, let $W$ be the Walsh transform, $W_{rs} = 2^{-n/2} (-1)^{|r \wedge s|}$
where $|r \wedge s|$ is the number of 1's the two assignments have in common. We define the mixing matrix
as $U^{(h)} = W T^{(h)} W$ where $T^{(h)}$ is diagonal with $T^{(h)}_{ss} = e^{i\pi\tau(h, |s|)}$, $|s|$ denotes the number of 1-bits
in $s$ and $\tau$ is another computationally-efficient real-valued function. With these definitions, $U^{(h)}_{rs}$
depends only on the distance $d(r, s)$ [27], i.e., has the form $U^{(h)}_{rs} = u^{(h)}_{d(r,s)}$. Quantum computers
evaluate this matrix operation efficiently [?].

Observing the final superposition gives an assignment having $c$ conflicts with probability

$$p^{(j)}(c) = \sum_{s \,|\, c(s) = c} \left| \psi^{(j)}_s \right|^2$$

with the sum over all assignments with $c$ conflicts. In particular, $P_{\mathrm{soln}}(j) = p^{(j)}(0)$ is the probability
to find a solution in a single trial.

Completing the algorithm requires specifying functional forms for the phase adjustment functions
$\rho$ and $\tau$. Since $c(s)$ and $|s|$ are integers, the matrix elements are unchanged by adding any multiple
of 2 to either $\rho$ or $\tau$. Moreover, changing the sign of both values simply conjugates the matrix
elements. Thus, without loss of generality, we can restrict the $\rho$ values to be in the range $(-1, 1]$
and $\tau$ in $[0, 1]$. In the remainder of this section we describe the special case equivalent to amplitude
amplification and then discuss one way to include problem structure.

3.1 Amplitude Amplification 

In the notation introduced above, amplitude amplification consists of the choices

$$\rho(h, c) = \begin{cases} 1 & \text{if } c = 0 \\ 0 & \text{otherwise} \end{cases} \qquad \tau(h, b) = \begin{cases} 0 & \text{if } b = 0 \\ 1 & \text{otherwise} \end{cases}$$

These choices, which are the same for all steps (i.e., independent of $h$), cause $P$ to invert the
amplitude of solutions and make $U$ a diffusion matrix with $u_d = -\delta_{d0} + 2^{1-n}$ where $\delta_{ab}$ is one if
$a = b$ and zero otherwise. Note the off-diagonal elements of $U$ are exponentially small.

By treating all nonsolution states equally, this algorithm has the major advantage of a simple
expression for the probability to find a solution after $j$ steps, namely [?]

$$P_{\mathrm{soln}}(j) = \sin^2((2j + 1)\theta) \qquad (2)$$

where $\theta = \sin^{-1} \sqrt{S/2^n}$ and $S$ is the number of solutions. For hard, soluble random $k$-SAT,
$2^n \gg S \gg 1$ and $\theta \approx \sqrt{S/2^n}$ is exponentially small. Thus the algorithm can give $P_{\mathrm{soln}} = \Theta(1)$ when
$j = \Omega(1/\theta)$, i.e., after an exponentially large number of steps for hard problems.
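
The closed form of Eq. (2) can be checked against a direct state-vector simulation for small $n$. The sketch below is a minimal illustration, assuming the phase choices $\rho = 1$ on solutions (0 otherwise) and $\tau = 0$ for the all-zero bit string (1 otherwise), which give the diffusion matrix stated above; the function names and the arbitrary choice of solution states are hypothetical.

```python
import math


def walsh_transform(vec):
    """Apply W, with W_rs = 2^(-n/2) (-1)^|r AND s|, using the fast Walsh-Hadamard butterfly."""
    out = list(vec)
    size, h = len(out), 1
    while h < size:
        for start in range(0, size, 2 * h):
            for i in range(start, start + h):
                a, b = out[i], out[i + h]
                out[i], out[i + h] = a + b, a - b
        h *= 2
    norm = math.sqrt(size)
    return [x / norm for x in out]


def simulated_p_soln(n, solutions, steps):
    """Probability in the solution states after `steps` applications of U P."""
    size = 1 << n
    psi = [complex(2 ** (-n / 2))] * size  # equal superposition
    for _ in range(steps):
        # P: rho = 1 on solutions gives the phase e^{i pi} = -1; rho = 0 elsewhere.
        psi = [-a if s in solutions else a for s, a in enumerate(psi)]
        # U = W T W with tau = 0 for the all-zero string and tau = 1 otherwise.
        psi = walsh_transform(psi)
        psi = [a if s == 0 else -a for s, a in enumerate(psi)]
        psi = walsh_transform(psi)
    return sum(abs(psi[s]) ** 2 for s in solutions)


def closed_form_p_soln(n, num_solutions, steps):
    """Eq. (2): sin^2((2j + 1) theta) with theta = asin(sqrt(S / 2^n))."""
    theta = math.asin(math.sqrt(num_solutions / 2 ** n))
    return math.sin((2 * steps + 1) * theta) ** 2


if __name__ == "__main__":
    n, sols = 6, {5, 17, 40}  # assumed: three arbitrary states marked as solutions
    for j in range(7):
        print(j, simulated_p_soln(n, sols, j), closed_form_p_soln(n, len(sols), j))
```

The simulation stores all $2^n$ amplitudes, so it is limited to small $n$, in line with the simulation limits discussed later in the paper.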

In practice, $S$ is not known a priori, so the best choice for the number of steps $j$ cannot be
determined. A useful alternative selects $j$ differently for each trial as follows [?]: Starting with
$M = 1$,

• perform a single amplitude amplification trial with the number of steps $j$ selected randomly
between 0 and $M - 1$

• if a solution is found, stop. Otherwise, set $M = \min(2^{n/2}, 6M/5)$ and repeat.

This procedure increases the expected cost, compared to having prior knowledge of $S$, by at most a
factor of 4 [?]. For the sake of definite comparison with other choices for $\rho$ and $\tau$, we describe how
to evaluate the expected number of steps to find a solution.

In light of Eq. (2), selecting the number of steps $j$ uniformly at random between 0 and $M - 1$
gives the probability to obtain a solution [?]

$$P_{\mathrm{random}}(M) = \frac{1}{M} \sum_{j=0}^{M-1} P_{\mathrm{soln}}(j) = \frac{1}{2} - \frac{\sin(4M\theta)}{4M \sin(2\theta)} \qquad (3)$$

which approaches 1/2 as $M$ increases.
The trial with a given $M$ takes $(M - 1)/2$ steps, on average. With probability $1 - P_{\mathrm{random}}(M)$
the trial is not successful. Thus the expected cost for all trials starting with $M$ is

$$\mathrm{cost}(M) = \frac{M - 1}{2} + (1 - P_{\mathrm{random}}(M)) \, \mathrm{cost}(\min(2^{n/2}, 6M/5)) \qquad (4)$$

Once $M$ reaches $2^{n/2}$, further iterations keep $M = 2^{n/2}$ so Eq. (4) gives

$$\mathrm{cost}(2^{n/2}) = \frac{2^{n/2} - 1}{2 P_{\mathrm{random}}(2^{n/2})}$$

For hard, soluble random $k$-SAT, $P_{\mathrm{random}}(2^{n/2}) \approx 1/2$ so $\mathrm{cost}(2^{n/2}) \approx 2^{n/2}$. This condition and
Eq. (4) allow computing the expected cost of the entire loop, i.e., $\mathrm{cost}(1)$, recursively.
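
The recursion for the expected cost can be evaluated by building the schedule of $M$ values forward and then applying Eq. (4) backward from the terminal value. The following Python sketch assumes Eq. (3) is used with real-valued $M$, as the update $M \to 6M/5$ suggests; the function names are hypothetical.

```python
import math


def p_random(M, theta):
    """Eq. (3): success probability when j is uniform on 0, ..., M - 1."""
    return 0.5 - math.sin(4 * M * theta) / (4 * M * math.sin(2 * theta))


def expected_cost(n, S):
    """cost(1) from Eq. (4), for an instance with n variables and S solutions."""
    theta = math.asin(math.sqrt(S / 2 ** n))
    cap = 2 ** (n / 2)
    # Schedule M = 1, 6/5, (6/5)^2, ..., capped at 2^(n/2).
    schedule = [1.0]
    while schedule[-1] < cap:
        schedule.append(min(cap, 6 * schedule[-1] / 5))
    # Terminal condition: M stays at the cap, so cost = (cap - 1) / (2 P_random(cap)).
    cost = (cap - 1) / (2 * p_random(cap, theta))
    # Apply Eq. (4) backward through the earlier, smaller values of M.
    for M in reversed(schedule[:-1]):
        cost = (M - 1) / 2 + (1 - p_random(M, theta)) * cost
    return cost


if __name__ == "__main__":
    n, S = 20, 20
    known_s_cost = math.pi / 4 * math.sqrt(2 ** n / S)  # cost when S is known in advance
    print("cost(1):", expected_cost(n, S), "cost with S known:", known_s_cost)
```

Evaluating the recursion iteratively avoids deep recursion, since the schedule has only about $n/(2 \log_2 1.2)$ entries.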



3.2 Using Problem Structure 

With the algorithm described here, using problem structure is conceptually straightforward: for a
class of problems, such as random $k$-SAT with given $n$ and $m$, select values for the phase functions
$\rho(h, c)$ and $\tau(h, b)$ to minimize the search cost for typical instances of the class. We take the number
of steps in each trial, $j$, to grow only polynomially with $n$. Furthermore, $m = O(n)$ for hard random
$k$-SAT. Thus the number of values to specify $\rho$ and $\tau$ grows polynomially with $n$. In particular, for
$j$ growing linearly with $n$, $O(n^2)$ values completely specify these functions.

We thus have a situation commonly found with developing conventional heuristics: a number of 
algorithm parameters to tune with respect to the class of problems. Generally, the heuristics are
too complicated to permit a useful analytical relation between the parameter values and algorithm 
cost. Instead, one takes a sample of problem instances and solves them with various choices for the 
parameter values. Numerical optimization techniques can then find parameter values giving good 
performance for the sample, e.g., minimizing the median search cost for the sample. These values 
are evaluated by using them to solve another sample drawn from the same problem ensemble. Since 
the cost of these heuristics grows exponentially for hard problems, this sampling technique is limited 
to relatively small problems. Nevertheless, efficient implementations often allow investigating SAT 
problems with hundreds or thousands of variables. 

These remarks also apply to heuristics for quantum computers. On a quantum machine, each
trial requires only polynomial time. On the other hand, at least for most parameter choices, $P_{\mathrm{soln}}$
is exponentially small, thus requiring exponentially many trials to estimate $P_{\mathrm{soln}}$ on the sample's
problem instances. Hence a direct attempt to find parameter values minimizing the median search
cost would require exponentially many trials. One way to address this difficulty is to identify
how good parameter choices scale with $n$ and then perform the parameter value optimization with
smaller $n$. An example of such scaling is having the phase parameters scale as $1/j$, as described
below. Another approach makes use of the shift in amplitudes toward low-cost states, illustrated in
Fig. 1. Thus, instead of maximizing $P_{\mathrm{soln}}$ we could minimize the expected cost of the state produced
by a trial, a quantity easily estimated with a modest number of trials.

Currently, however, such quantum machines do not exist. Instead, we must simulate the quantum 
algorithm on conventional machines, so each trial requires exponential cost and memory. Thus we 
are limited to investigating much smaller problems, up to 20 variables or so for SAT. In particular, 
the simulation evaluates properties of all search states and so is considerably more expensive than 
evaluating conventional heuristics. The latter, while having exponentially growing costs, typically 
evaluate only a tiny portion of the full search space. 

The number of function evaluations for a numerical optimization procedure grows with the 
number of values to optimize. Thus as a practical matter, we consider only a restricted set of 
possible values with a smaller number of independent parameters. In particular, a study of this
algorithm with a fixed number of steps [28] suggests restricting $\rho$ and $\tau$ to vary linearly with the
number of conflicts $c$ and number of 1-bits $b$, respectively, only slightly reduces the performance.
We make this restriction in the specific form for the heuristic presented below.

An alternate approach to finding good parameter values, also discussed below, uses an approx- 
imate analytical theory of the algorithm performance. The theory allows rapid evaluation of the 
approximate performance for a given choice of parameter values. We can then apply numerical 
optimization to find values giving high performance according to this approximation. This approach 
allows evaluating behavior for much larger problem sizes, but with the caveat of being only an 
approximation. 



3.3 Parameter Choices to Use Problem Structure 

To exploit problem structure, we introduce two real-valued functions $R(\lambda)$ and $T(\lambda)$ defined over
$0 \le \lambda \le 1$. These functions specify the amplitude adjustments made, respectively, by the cost
evaluation and mixing, as a function of the number of steps completed. Specifically, $R$ and $T$ define
the phase adjustment functions as

$$\rho(h, c) = \rho_h c, \qquad \tau(h, b) = \tau_h b$$

with

$$\rho_h = \frac{1}{j} R\left(\frac{h - 1}{j}\right), \qquad \tau_h = \frac{1}{j} T\left(\frac{h - 1}{j}\right) \qquad (5)$$

for steps $h = 1, \ldots, j$. These values decrease as $1/j$ so $P$ and $U$ are close to identity matrices as $j$
increases. When iterated over the $j$ steps of a trial, these operations nevertheless substantially shift
amplitudes among the assignments. The linearity of $\tau(h, b)$ with respect to $b$ means the elements of
the mixing matrix are [27]:

$$u^{(h)}_d = \left( e^{i\pi\tau_h/2} \cos\left(\frac{\pi\tau_h}{2}\right) \right)^n \left( -i \tan\left(\frac{\pi\tau_h}{2}\right) \right)^d \qquad (6)$$


Thus the elements decrease rapidly with d so the largest mixing is among states close to each other. 
In the case we consider, j grows as a power of n (allowing individual trials to complete in polynomial 
time). This means the off-diagonal terms of U corresponding to d = O (1) decrease as a power of n 
rather than the exponential decrease of the diffusion mixing matrix. 
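
The closed form for the mixing matrix elements can be verified numerically for a small case by forming $U = W T W$ explicitly and comparing each entry with the expression depending only on the Hamming distance. The sketch below is an illustration only; the explicit matrix product scales as $8^n$ operations, and the function names and the value of $\tau_h$ are assumptions.

```python
import cmath
import math


def popcount(x):
    """Number of 1-bits in the integer x."""
    return bin(x).count("1")


def mixing_matrix(n, tau_h):
    """Explicit U = W T W with T_ss = exp(i pi tau_h |s|); usable only for tiny n."""
    size = 1 << n
    W = [[(-1) ** popcount(r & s) / math.sqrt(size) for s in range(size)] for r in range(size)]
    T = [cmath.exp(1j * math.pi * tau_h * popcount(s)) for s in range(size)]
    return [[sum(W[r][t] * T[t] * W[t][s] for t in range(size)) for s in range(size)]
            for r in range(size)]


def u_closed_form(n, tau_h, d):
    """Closed form: (e^{i pi tau/2} cos(pi tau/2))^n (-i tan(pi tau/2))^d."""
    half = math.pi * tau_h / 2
    return (cmath.exp(1j * half) * math.cos(half)) ** n * (-1j * math.tan(half)) ** d


def max_deviation(n, tau_h):
    """Largest difference between the explicit matrix and the closed form."""
    U = mixing_matrix(n, tau_h)
    size = 1 << n
    return max(abs(U[r][s] - u_closed_form(n, tau_h, popcount(r ^ s)))
               for r in range(size) for s in range(size))


if __name__ == "__main__":
    print("max deviation for n = 3, tau_h = 0.3:", max_deviation(3, 0.3))
```

The agreement follows because $W T W$ factors into $n$ identical single-bit operations when $\tau(h, b)$ is linear in $b$.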

Completing the algorithm requires explicit forms for $R(\lambda)$ and $T(\lambda)$ and the number of steps
$j$. Ideally these quantities would minimize the expected total number of steps in all trials for the
particular problem instance. For hard problems, such optimal choices are not known a priori. Thus
we focus instead on functional forms giving good performance on average for random $k$-SAT, i.e.,
depending only on the ensemble parameters $n$, $k$ and $m$. While the values could vary from one trial
to the next, in analogy with the procedure described above for amplitude amplification when the
number of solutions is not known, for simplicity we use the same values for each trial. The expected
cost to find a solution is then $j/P_{\mathrm{soln}}$. While such choices will not be optimal for each instance, they
can nevertheless improve average performance, as shown below.

Figure 1: Solving a randomly generated 3-SAT problem with $n = 20$ and $\mu = 4.25$. For each step $h$, the
figure shows the probability $p^{(h)}(c)$ in assignments with each number of conflicts. Shading is based on the
relative deviations of the amplitudes, described in the text. The small contributions for assignments with
$c > 15$ are not included. This instance has 20 solutions.

4 Algorithm Behavior 

This section illustrates the algorithm's behavior for small problems, and compares it to amplitude 
amplification. These observations motivate the approximate analyses of the following section. 

For $\mu = 4.25$, using $j = n$ and linear forms for $R$ and $T$ gives reasonably good performance.
Specifically [27], for

$$R(\lambda) = R_0 + R_1 (1 - \lambda), \qquad T(\lambda) = T_0 + T_1 (1 - \lambda) \qquad (7)$$

with $R_0 = 4.86376$, $R_1 = -4.18118$, $T_0 = 1.2$ and $T_1 = 3.1$, Fig. 1 shows the behavior for one
problem instance. These numerical values were determined from the approximate analysis, based
on average amplitudes, discussed in §5.2.
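
A single trial of the heuristic with these linear forms can be simulated classically for small $n$ by storing all $2^n$ amplitudes. The following Python sketch is a minimal illustration rather than the implementation used for the paper's results: the instance encoding, the function names and the choice $n = 10$ are assumptions, while the coefficients are the values quoted above.

```python
import cmath
import math
import random

# Linear-form coefficients quoted in the text for mu = 4.25.
R0, R1, T0, T1 = 4.86376, -4.18118, 1.2, 3.1


def random_3sat(n, m, rng):
    """m random clauses; each literal is (variable index, value that satisfies it)."""
    return [[(v, rng.random() < 0.5) for v in rng.sample(range(n), 3)] for _ in range(m)]


def cost_table(n, clauses):
    """c(s) for all 2^n assignments; bit v of the integer s is the value of variable v."""
    def violated(s, clause):
        return all(bool((s >> v) & 1) != want for v, want in clause)
    return [sum(violated(s, cl) for cl in clauses) for s in range(1 << n)]


def walsh_transform(vec):
    """Fast Walsh-Hadamard transform, normalized so that it equals the matrix W."""
    out = list(vec)
    size, h = len(out), 1
    while h < size:
        for start in range(0, size, 2 * h):
            for i in range(start, start + h):
                a, b = out[i], out[i + h]
                out[i], out[i + h] = a + b, a - b
        h *= 2
    norm = math.sqrt(size)
    return [x / norm for x in out]


def heuristic_trial(n, costs, j):
    """Amplitudes after j steps of psi <- W T W P psi, with rho_h and tau_h from R and T."""
    size = 1 << n
    ones = [bin(s).count("1") for s in range(size)]
    psi = [complex(2 ** (-n / 2))] * size
    for h in range(1, j + 1):
        lam = (h - 1) / j
        rho_h = (R0 + R1 * (1 - lam)) / j
        tau_h = (T0 + T1 * (1 - lam)) / j
        psi = [cmath.exp(1j * math.pi * rho_h * costs[s]) * psi[s] for s in range(size)]
        psi = walsh_transform(psi)
        psi = [cmath.exp(1j * math.pi * tau_h * ones[s]) * psi[s] for s in range(size)]
        psi = walsh_transform(psi)
    return psi


def prob_by_cost(psi, costs):
    """p(c): total probability in assignments with c conflicts."""
    probs = {}
    for amp, c in zip(psi, costs):
        probs[c] = probs.get(c, 0.0) + abs(amp) ** 2
    return probs


if __name__ == "__main__":
    rng = random.Random(0)
    n = 10
    m = int(4.25 * n)
    costs = [1]
    while min(costs) > 0:  # resample until the instance is soluble
        costs = cost_table(n, random_3sat(n, m, rng))
    p_soln = prob_by_cost(heuristic_trial(n, costs, j=n), costs).get(0, 0.0)
    print("solutions:", costs.count(0), "P_soln:", p_soln)
    print("expected steps j / P_soln:", n / p_soln if p_soln > 0 else math.inf)
```

Printing `prob_by_cost` after each step, rather than only at the end, would show how the probability distribution over costs moves from step to step.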



This figure illustrates several properties of the algorithm. First, at each step, probability is con- 
centrated in states with a fairly small range of costs. Each step shifts the peak in the probability 
distribution toward assignments with fewer conflicts, until a large probability builds up in the so- 
lutions. This shift is also seen for other problem instances (with differing final probabilities) and 
when averaged over many samples. The peaks become sharper for larger $n$, with relative widths
decreasing as $O(1/\sqrt{n})$. By contrast, amplitude amplification increases the probability in solutions
but all other amplitudes decrease uniformly.

Second, the variation of amplitudes among states with the same cost is relatively large only in the
last few steps of the algorithm and then primarily for higher-cost states for which the amplitudes are
small. The shading in Fig. 1 shows this behavior, indicating the relative deviation of the amplitudes
(i.e., ratio of standard deviation to mean) for states with each cost, ranging from white for zero
deviation to black for relative deviations greater than 3.

Figure 2: $P_{\mathrm{soln}}$ as a function of the number of steps for several 3-SAT instances. The first plot shows the
behavior of the heuristic algorithm with the same parameters as Fig. 1 and using $j = n$ to define the phase
parameters in Eq. (5). The dashed curve is an instance with 80 solutions, the thick solid curve is the instance
with 20 solutions of Fig. 1 and the thin solid curves are different instances with 5 solutions. The second plot
shows the behavior of amplitude amplification with 80, 20 and 5 solutions for the dashed, gray and solid
curves, respectively. For comparison, the 20-solution curve from the first plot is also included.

Fig. 2 gives further insight into the algorithm. Unlike amplitude amplification, the heuristic
reaches its maximum $P_{\mathrm{soln}}$ at about the same number of steps for problems with differing numbers
of solutions. Instead, the variation is in the maximum value of $P_{\mathrm{soln}}$. Even instances with the
same number of solutions behave differently. Thus for this algorithm, identifying the appropriate
number of steps is not an issue; rather, the difficulty is in selecting appropriate phase parameter
adjustments. As problem size increases, the number of steps required for amplitude amplification
increases exponentially and always gives $P_{\mathrm{soln}} \approx 1$. By contrast, the heuristic uses a linearly growing
number of steps but $P_{\mathrm{soln}}$ gets small.

To compare the net effect of these contrasting behaviors, we examine the search cost scaling of
the two methods. Using the parameters of Eq. (7), Fig. 3 shows the growth of the expected search cost
for randomly generated problems, i.e., the expected number of steps, $j/P_{\mathrm{soln}}(j)$. The exponential
fit gives the cost growing as $e^{0.10n}$. As one caveat, we should note most of the $P_{\mathrm{soln}}$ values are fairly
large for these problem sizes, i.e., $P_{\mathrm{soln}} > 0.3$, thus usually finding a solution after only a few trials.
Thus much of the variation in costs shown here is due to the linear growth of the number of steps
$j$, and it may require larger problems to see the cost growth dominated by the behavior of $P_{\mathrm{soln}}$.

Fig. 4 shows another property of the heuristic: as long as $j/n$ is not too small, $P_{\mathrm{soln}}$ does not
change much as $j$ increases. This can be understood from the scaling of phase parameters of Eq. (5).
When $j \gg 1$, the algorithm matrices are close to the identity. In this situation, when $j$ is doubled,
the phase adjustments are halved so two steps change the amplitudes about as much as the original
choice of $j$ did in one step. As a further observation, the minimum median cost, $j/\mathrm{median}(P_{\mathrm{soln}})$,
occurs at somewhat smaller ratios of $j/n$ as $n$ increases. Exploiting this decrease gives somewhat
lower costs for the quantum heuristic than those shown in Fig. 3, which used $j/n = 1$. In addition,
the value of $j/n$ giving about half the maximum $P_{\mathrm{soln}}$ for each $n$ decreases close to
linearly on a log-log plot with a slope of about -0.8, indicating the best scaling performance requires
only $j = O(n^{0.2})$. If this behavior continues to hold for larger $n$, the approximate analysis discussed
below, based on $j \gg \sqrt{n}$, would somewhat overestimate the minimum possible costs.

Finally we should note the distinction between Fig. 2 and Fig. 4. In the former, the phase
parameters of Eq. (5) are defined using $j = n$ and the behavior of $P_{\mathrm{soln}}$ is shown for trials of various
numbers of steps using these fixed parameters. In the latter figure, $j$ varies and gives different phase
parameters at each value of $j/n$, and $P_{\mathrm{soln}}$ is shown after completing $j$ steps.

Figure 3: Log-plot of median search cost vs. $n$ for the quantum heuristic (diamond), amplitude amplification
(assuming the number of solutions is known (triangle) or not (square)), and GSAT [?] with restarts after
In steps (circle). For each $n$, the same 1000 soluble random 3-SAT problems with $\mu = 4.25$ were solved with
each method (except only 500 and 400 samples for $n = 24$ and 26, respectively). For those $n$ not divisible
by 4, half the samples had $m = \lfloor 4.25n \rfloor$ and half had $m$ larger by one. Error bars show the 95% confidence
intervals [?, p. 124], which in many cases are smaller than the plotted point. The curves show exponential
fits to the quantum heuristic (solid) and amplitude amplification (dashed).

4.1 Comparing with Amplitude Amplification 

Provided the number of solutions $S$ is known, the cost for amplitude amplification is [?] $\frac{\pi}{4}\sqrt{2^n/S}$,
also shown in Fig. 3. The values grow as $e^{0.30n}$, i.e., about three times faster than the quantum
heuristic.

In practice, $S$ is not known a priori, requiring the modified algorithm, described in §3.1, whose
expected cost is less than four times larger [?], so does not affect the exponential growth rate.
However, for the sake of a definite comparison with the quantum heuristic, which also does not use
prior knowledge of the number of solutions, we compute the actual expected cost of the modified
algorithm using Eq. (4). The resulting values, included in Fig. 3, are slightly less than twice as large
as the cost for amplitude amplification when $S$ is known.

4.2 Comparing with a Conventional Heuristic 

Figure 4: Median solution probability vs. $j/n$ for the quantum heuristic for 1000 random 3-SAT problems
with $n = 12$ (solid) and $n = 20$ (dashed), and 60 samples with $n = 26$ (gray). For these cases $\mu = 4.25$.
Error bars show the 95% confidence intervals.

Average costs for even the best known classical heuristics grow exponentially. For instance, Fig. 3
shows the search cost for a good classical heuristic, GSAT [45], grows slightly faster than this quantum
heuristic. The GSAT algorithm starts from a random assignment and, for each step, examines
the number of conflicts in the assignment's neighbors (i.e., assignments obtained by changing the
value for a single variable) and moves to a neighbor with the fewest conflicts. If a solution isn't
found after a prespecified number of steps, e.g., because the current assignment is a local minimum,
the search is tried again from a new random assignment. The most significant comparison between
GSAT and the quantum heuristic is the relative growth rates in the search costs, as measured by the
number of steps. The corresponding actual search times will depend on detailed implementations
of the steps. Although the numbers of elementary computational steps, involving evaluating the
number of conflicts in an assignment (and, in the case of GSAT, its neighbors), are similar for both
techniques, differences in the extent to which operations can be optimized away (e.g., as is possible
in some cases for NMR-based quantum implementations [?]) and the relative clock rates of classical
and quantum machines remain to be seen.
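
The GSAT procedure described in this section can be sketched as follows. This is a minimal illustration, not the implementation used for the paper's comparison: random tie-breaking among the best neighbors, the clause encoding and the parameter names `max_flips` and `max_tries` are assumptions.

```python
import random


def num_conflicts(assign, clauses):
    """Number of clauses with every literal false; a literal is (variable index, wanted value)."""
    return sum(all(assign[v] != want for v, want in cl) for cl in clauses)


def gsat(n, clauses, max_flips, max_tries, rng):
    """Return (solution or None, total number of steps taken over all tries)."""
    flips = 0
    for _ in range(max_tries):
        assign = [rng.random() < 0.5 for _ in range(n)]  # new random starting assignment
        for _ in range(max_flips):
            if num_conflicts(assign, clauses) == 0:
                return assign, flips
            # Score every neighbor, i.e., every assignment differing in one variable.
            scores = []
            for v in range(n):
                assign[v] = not assign[v]
                scores.append(num_conflicts(assign, clauses))
                assign[v] = not assign[v]
            best = min(scores)
            # Move to a neighbor with the fewest conflicts (ties broken at random).
            v = rng.choice([i for i, sc in enumerate(scores) if sc == best])
            assign[v] = not assign[v]
            flips += 1
        if num_conflicts(assign, clauses) == 0:
            return assign, flips
    return None, flips


if __name__ == "__main__":
    # (v0 OR NOT v1) AND (v1 OR v2), the 2-SAT example from Section 2 with 0-based variables.
    clauses = [[(0, True), (1, False)], [(1, True), (2, True)]]
    solution, steps = gsat(3, clauses, max_flips=10, max_tries=50, rng=random.Random(1))
    print("solution:", solution, "steps:", steps)
```

Because the move is always to a best neighbor, even when that increases the number of conflicts, the search can leave a local minimum only by a sideways or uphill step, which is why restarts are needed.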

At any rate, the figure shows that including the number of conflicts in the phase adjustments 
reduces the number of steps required for the quantum algorithm below that required for GSAT, on 
average. Because the trials are independent, both the quantum heuristic introduced here and GSAT 
can be quadratically improved with amplitude amplification, amounting to decreasing the growth
rates shown in the figure by a factor of two. Such an improvement requires extending coherence 
across multiple trials, rather than just a single one. 

This technique also generalizes to allow the quantum heuristic presented here to work with 
the results of deterministic classical heuristics with independent trials (e.g., a deterministic version 
of GSAT in which, say, any ties are broken by selecting the first neighbor with minimum cost 
in a lexicographic ordering of the states, or the seed used with the random number generator is 
prespecified for all trials). Specifically, instead of basing the phase adjustment on the number of 
conflicts in a state, we run GSAT starting from that state for a fixed number of steps. We can 
then use the number of conflicts of the resulting state as the basis for the phase adjustment. This 
thus uses more information about the heuristic than just combining it with amplitude amplification, 
which tests whether the heuristic finds a solution [?]. For the problem sizes discussed here, using
this technique gives considerably higher probabilities to find a solution, even using the same phase
adjustment parameters as used for the original method involving the number of conflicts in the states.
However, the additional steps required to evaluate GSAT within each trial result in a larger overall
cost. Nevertheless, this technique may be useful for larger problem sizes, where the probabilities to
find solutions are lower.

From this discussion, the structured quantum algorithm appears to improve on GSAT for hard
SAT problems, but definitive statements cannot be made based only on such small problems.
Unfortunately, classical simulations of quantum machines incur an exponential growth in time and
memory, preventing evaluation with larger problems. More extensive empirical evaluation requires
either faster simulation techniques, perhaps approximate [?], or quantum computers.

5 Approximate Analyses of Behavior 

The usual approach to evaluating conventional heuristics, and tuning any adjustable parameters 
they may have, is by running them on a sample of problems. This is necessary because analytical 
methods are often unable to account for the complicated dependencies in the search path explored 
by the heuristic. As discussed in the previous section, such simulations are also useful for quantum 
algorithms, but are limited to small problems. 

As a complementary approach, we consider approximate analytical techniques. The average 
properties of random k-SAT help in understanding and improving search methods, both 
classical |l3|, 0, U, |l^] and quantum p7j . In particular, the quantum algorithm operates on the 
entire search space at each step, so its performance depends on averaged properties of the search 
states. For simple ensembles, such as random k-SAT, such averages are readily computable and 
thus give asymptotic characterizations of the problems for large n. In addition to estimating 
algorithm performance, such analyses provide insight into the qualitative features of the behavior 
seen empirically. 

For a problem instance $P$, let $\psi_s^{(h)}(P)$ be the amplitude for state $s$ after completing step $h$ of 
the algorithm. Initially, $\psi_s^{(0)}(P) = 2^{-n/2}$. A single step of the algorithm, from Eq. 1, gives 

$$\psi_r^{(h)}(P) = \sum_s u^{(h)}_{d(r,s)} \, e^{i \pi \rho_h c(s)} \, \psi_s^{(h-1)}(P) \qquad (8)$$

The remainder of this section discusses techniques using the properties of random k-SAT to estimate 
the algorithm's behavior for large n, and to suggest suitable choices for the phase functions. 
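
For small n the single-step map of Eq. 8 can be applied by brute force, which may help in checking the approximations that follow. The sketch below is illustrative only: the closed form used for the mixing coefficients u_d (the product form that results from a Hadamard, diagonal-phase, Hadamard mixing with phase parameter tau) is an assumption, since the mixing matrix is defined outside this section, and the cost table and parameter values are hypothetical.

```python
import cmath

def hamming(r, s):
    """Hamming distance between two assignments encoded as integers."""
    return bin(r ^ s).count("1")

def mixing_coefficients(n, tau):
    """Assumed form of u_d for d = 0..n; any unitary distance-based mixing could be used."""
    e = cmath.exp(1j * cmath.pi * tau)
    return [((1 + e) / 2) ** (n - d) * ((1 - e) / 2) ** d for d in range(n + 1)]

def step(psi, cost, u, rho):
    """Apply Eq. 8 once: phase each amplitude by its cost, then mix by Hamming distance."""
    size = len(psi)
    phased = [cmath.exp(1j * cmath.pi * rho * cost[s]) * psi[s] for s in range(size)]
    return [
        sum(u[hamming(r, s)] * phased[s] for s in range(size))
        for r in range(size)
    ]

n = 3
cost = [0, 1, 1, 2, 1, 2, 2, 3]      # hypothetical costs for the 2**n states
psi = [2 ** (-n / 2)] * 2 ** n       # uniform initial amplitudes
psi = step(psi, cost, mixing_coefficients(n, tau=0.3), rho=0.2)
total_probability = sum(abs(a) ** 2 for a in psi)
```

Since both the phase adjustment and the assumed mixing are unitary, the total probability stays at 1 after the step, which gives a quick sanity check on an implementation.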

5.1 Average $P_{\mathrm{soln}}$ 

Ideally, we would like to estimate the typical search cost for problems in the ensemble. The expected 
cost for a given problem instance is $j/P_{\mathrm{soln}}$. Thus one quantity to examine is the ensemble average 
$\langle j/P_{\mathrm{soln}} \rangle$ or, since in the case considered here $j$ is the same for all instances, $j \langle 1/P_{\mathrm{soln}} \rangle$. However, 
this quantity is infinite if even a single problem instance is insoluble. Even restricting attention to 
soluble instances, we find a wide variation in solution costs. Thus the average is dominated by a 
small fraction of the instances and does not indicate typical behavior. A better indication is the 
median value of $j/P_{\mathrm{soln}}$, but it is difficult to treat analytically. As an analytically tractable quantity, 
we focus instead on $j/\langle P_{\mathrm{soln}} \rangle$. 
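
The difference between these three summaries can be seen with a small numerical example. The solution probabilities below are invented for illustration and are not taken from the simulations; they only show how a single hard instance dominates the mean cost while barely affecting the median or $j/\langle P_{\mathrm{soln}} \rangle$.

```python
import statistics

j = 10                                   # hypothetical number of steps per trial
p_soln = [0.5, 0.4, 0.3, 0.2, 0.001]     # hypothetical per-instance solution probabilities

costs = [j / p for p in p_soln]
mean_cost = statistics.mean(costs)        # <j / P_soln>, dominated by the hardest instance
median_cost = statistics.median(costs)    # cost of a typical instance
tractable = j / statistics.mean(p_soln)   # j / <P_soln>
```

In this made-up sample the mean cost is roughly sixty times the median, whereas $j/\langle P_{\mathrm{soln}} \rangle$ happens to lie close to the median. Nothing guarantees such closeness in general; the only general relation is that $j/\langle P_{\mathrm{soln}} \rangle$ never exceeds the mean cost.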

The random k-SAT ensemble includes both soluble and insoluble instances, so $\langle P_{\mathrm{soln}} \rangle \le P_{\mathrm{soluble}}$, 
the fraction of soluble instances in the ensemble. Below the phase transition, near $\mu = 4.25$ for 
random 3-SAT, $P_{\mathrm{soluble}} \to 1$. For larger $\mu$, $P_{\mathrm{soluble}} \to 0$ as n increases, in which case the performance 
for soluble problems is given instead by $\langle P_{\mathrm{soln}} \rangle / P_{\mathrm{soluble}}$. Unfortunately, the random k-SAT ensemble 
does not have a simple expression for $P_{\mathrm{soluble}}$, or even just its leading exponential scaling rate, 
precluding an exact evaluation for overconstrained soluble problems. One approach to estimate this 
behavior uses empirical classical search to evaluate $P_{\mathrm{soluble}}$ for a range of problem sizes for a given 
value of $\mu$. The behavior of these values as a function of n then estimates the scaling of $P_{\mathrm{soluble}}$. 
For instance, samples of $10^4$ problems for n from 50 to 250 show [Q close to exponential decrease 
of $P_{\mathrm{soluble}}$ for $\mu$ values somewhat above the transition. The resulting estimates of the actual decay 
rates for $P_{\mathrm{soluble}}$ are 0.011, 0.025 and 0.045 for $\mu$ equal to 4.5, 4.7 and 4.9, respectively. Analytic 
bounds on the behavior of $P_{\mathrm{soluble}}$ for $\mu$ between 4.2 and 5.2 are difficult to obtain. One such result 
is $P_{\mathrm{soluble}} < \exp(-5.9 \times 10^{-5} n)$ at $\mu = 4.762$ Q, which is a considerably smaller decay rate than 
suggested by empirical evaluation. Above $\mu = 5.2$, the expected number of solutions, $\langle S \rangle$, goes to 
zero, so the Markov bound $P_{\mathrm{soluble}} \le \langle S \rangle$ provides another constraint. 
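
The threshold near 5.2 quoted above for 3-SAT can be recovered from the expected number of solutions. The sketch below assumes $\langle S \rangle = 2^n (1 - 2^{-k})^m$, which follows from the appendix expression for the expected number of states with zero conflicts; the function names are chosen here for illustration.

```python
import math

def solution_growth_rate(mu, k=3):
    """Rate g in <S> = exp(g n), using <S> = 2**n (1 - 2**-k)**m with m = mu n."""
    return math.log(2) + mu * math.log(1 - 2.0 ** -k)

def markov_threshold(k=3):
    """Value of mu = m/n above which the expected number of solutions decays with n."""
    return -math.log(2) / math.log(1 - 2.0 ** -k)

mu_star = markov_threshold(3)
```

For k = 3 the threshold is about 5.19. Below it the expected number of solutions grows with n, so the Markov bound says nothing useful about $P_{\mathrm{soluble}}$ in the range from 4.2 to 5.2.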

For random k-SAT, $\langle P_{\mathrm{soln}} \rangle$ is 

$$\frac{1}{N_{\mathrm{clauses}}^{m}} \sum_{P} \; \sum_{s \,|\, c(s)=0} \left| \psi_s^{(j)}(P) \right|^2 \qquad (9)$$

where the outer sum is over the $N_{\mathrm{clauses}}^{m}$ possible problem instances and the inner sum is over those 
assignments s that are solutions for P. Since the clauses are selected independently, the sum over 
problems is equivalent to m sums, each of which ranges over the $N_{\mathrm{clauses}}$ possible clauses. Interchanging 
the order of summation gives an outer sum over all assignments s and an inner sum over those 
problem instances for which s is a solution, i.e., instances containing no clause conflicting with s. 
Since the random k-SAT ensemble treats all assignments equally, this sum over problems is the same 
for all choices of s. Thus we can focus on a single assignment, say $s = 0 \ldots 0$. For an assignment 
r and clause $\sigma$, let $a(r, \sigma) = 1$ if $\sigma$ conflicts with r and otherwise is zero. Then c(r) for problem 
instance P is the sum of $a(r, \sigma)$ over the clauses in P. With this notation, Eq. 9 gives $\langle P_{\mathrm{soln}} \rangle$ equal 
to 

$$\sum_{s_0,\ldots,s_{j-1}} \; \sum_{s'_0,\ldots,s'_{j-1}} \left( \prod_{h=1}^{j} u_h \, {u'_h}^{*} \right) \left( \frac{1}{N_{\mathrm{clauses}}} \sum_{\sigma} \exp\left( i \pi \sum_{h=1}^{j} \rho_h \, (a_h - a'_h) \right) \right)^{m} \qquad (10)$$

where $u_h = u^{(h)}_{d(s_h, s_{h-1})}$, $u'_h = u^{(h)}_{d(s'_h, s'_{h-1})}$, $a_h = a(s_{h-1}, \sigma)$, $a'_h = a(s'_{h-1}, \sigma)$ and we define $s_j = s'_j = s$. 
The $\sigma$ sum is over all clauses not conflicting with $s = 0 \ldots 0$, i.e., those $\sigma$ with $a(s, \sigma) = 0$. 

For constant j, an exact asymptotic analysis |2j| shows $\langle P_{\mathrm{soln}} \rangle$ decays exponentially but at smaller 
rates than the exponential growth in number of steps required by amplitude amplification. Fig. 5 
shows examples for j up to 5: specifically the decay rate A defined by $\langle P_{\mathrm{soln}} \rangle \sim \exp(-An)$. For 
example, with $\mu = 4.25$, for j = 5 the decay rate is A = 0.13, only slightly larger than the empirical 
growth rate of the median cost shown in Fig. ^| when j = n. This may indicate the problem sizes 
feasible to simulate are not large enough to show the full benefit of allowing j to grow with n. 

This analysis is useful in suggesting the scaling behavior of Eq. || for the algorithm's parameters 
and shows $\langle P_{\mathrm{soln}} \rangle$ decreases less rapidly as j increases. Unfortunately, this analysis is not applicable 
to the more interesting situation where the number of steps j grows with n. While it may be 
possible to develop approximations for j > 1, a simpler approach uses the observed properties of 
the amplitudes seen in §||. This approach is described in the remainder of this section. 

5.2 Average Amplitudes 

Empirical evaluations show that amplitudes for states with the same number of conflicts are generally 
quite similar. This observation motivates an analysis based on the behavior of the average amplitude 
for states with each cost [27]. Consider the quantity $A_C^{(h)}$ defined as $\langle \psi_s^{(h)}(P) \rangle$ with the average 
first over all states s with c(s) = C in the problem instance P, and then over all problems in the 
random k-SAT ensemble with given n and m. Assuming amplitudes for states with the same cost 
are the same, at least for the dominant cost states at each step, Eq. 8 becomes 

$$A_C^{(h)} = \sum_{d,c} u_d^{(h)} \, e^{i \pi \rho_h c} \, V_d(C, c) \, A_c^{(h-1)} \qquad (11)$$

where $V_d(C, c)$ is the expected number of states with c conflicts at distance d from a state with C 
conflicts. Significantly, $V_d(C, c)$ is a property of the problem ensemble, independent of the algorithm 
details. For random k-SAT, $V_d(C, c)$ is a multinomial sum described briefly in the appendix. 

Figure 5: Minimum decay rates for $\langle P_{\mathrm{soln}} \rangle$ as a function of $\mu = m/n$ for (from top to bottom) j = 1 through 
5 steps using linear variation in phase parameters with step number (i.e., of the form given in Eq. fj]), but 
with different numerical values for each j. The points indicate empirical estimates of the decay rate for 
$P_{\mathrm{soluble}}$, a lower bound on the decay rate for $\langle P_{\mathrm{soln}} \rangle$. The upper edge of the filled region is, in turn, a lower 
limit on the decay rate for $P_{\mathrm{soluble}}$ given by the Markov bound using the expected number of solutions. 

Simulations show probability concentrates in a small range of cost values, as illustrated 
schematically in Fig. 6. These dominant costs are, in turn, close to the average cost $\sum_C C \, v(C) \, |A_C|^2$ where 
$v(C)$ is the expected value, for random k-SAT, of the number of states with C conflicts. We can 
thus expand $A_c \approx A_{\bar{c}} \, Z^{c - \bar{c}}$ around the average cost $\bar{c}$, with Z a complex number depending on the 
step. 

In one step, Eq. 11 implies Z changes by $O(1/j)$. So for $j \gg 1$, Z becomes a smooth function 
of $\lambda = h/j$ satisfying the differential equation |^7j 



where X = i_J(iLfz|^ and 

/ = exp I -kn{l - Z) 



pQ- - x) _ x 

1 -p z 

The initial condition, corresponding to all amplitudes equal, is $Z(0) = 1$. With this approximation, 
the dominant cost value is $x m$. 

Figure 6: Schematic behavior of average amplitudes, on a logarithmic scale, as a function of number of 
conflicts c. The average number of states with c conflicts, v(c), is sharply peaked around the average 
number of conflicts $m/2^k$. When the magnitude of the amplitudes decreases rapidly with c, as shown here, 
the probability in states with c conflicts is also sharply peaked, but at a somewhat lower value, corresponding 
to the shift seen in Fig. 1. Quantitatively, the values decrease exponentially with n, so the logarithms, shown 
here, are proportional to n and the relative width of each peak is $O(1/\sqrt{n})$. 

With suitable choices for R and T, such as those in Eq. ^ for k = 3, $\mu = 4.25$, Eq. 12 gives 
$Z(1) = 0$, thereby predicting most of the amplitude concentrates in states with the fewest conflicts, 
i.e., solutions if the problem instance is soluble. 

5.3 Including Variation Among Amplitudes 

The approximation based on average amplitudes shows good correspondence with empirical 
evaluation for most of the steps of the algorithm. However, the variation among amplitudes with the 
same costs increases for the last few steps of the algorithm, as illustrated in Fig. 1. If this variation 
remains significant as problem sizes increase, especially among states with fairly low costs, it remains 
possible that the small averages predicted for nonsolution states when $Z(1) = 0$ are due to large 
variation in the phases of the amplitudes rather than small magnitudes, leading to somewhat less 
concentration in solution states than predicted. 

It is thus useful to estimate the contribution from this variation to the behavior of the algorithm. 
A direct approach would consider the ensemble average of the variance in amplitudes among states 
with each cost. Such an analysis gives a similar prediction, namely appropriate phase functions can 
concentrate amplitude sufficiently into low-cost states to give high average performance. To avoid 
introducing significant variation in amplitudes for states with the same cost, the resulting phase 
adjustments are smaller than those based on the behavior of the average amplitudes alone. Using 
such parameters for small problems gives significantly lower $P_{\mathrm{soln}}$ values, and hence higher costs, 
than shown in Fig. ^. Since the analysis assumes $\sqrt{n} \gg 1$, this poor performance for small n could 
be due to the small problem sizes. 

Another possibility is the analysis based on the variance of amplitudes overestimates the effect 
of amplitude variation. Specifically, in the case treated here, where the number of steps grows with 
n, most of the contribution to the amplitude of a given state is from other states relatively near to it. 
This is due to the decreasing values of $\tau_h$ in Eq. causing the mixing matrix elements $u_d$ in Eq. o to 
decrease rapidly with distance. Thus large variations among amplitudes for states that are far apart 
do not much affect the result of a single step. Conversely, nearby states share many of the same 
neighbors so their amplitudes are likely to be more correlated than those of distant states. This 
means a useful characterization of the amplitude variations should also account for the distance 
between the states. Thus we consider an approximation based on the assumption that nearby states 
with the same costs have approximately the same amplitudes. 

Figure 7: Distance relations and costs for the four assignments r, r', s and s'. 

Consider the quantity $S^{(h)}_{D,C,C'}$, defined as $\langle \psi_s^{(h)}(P) \, \psi_{s'}^{(h)}(P)^{*} \rangle$ with the average first over all pairs 
of states s, s' such that d(s, s') = D and c(s) = C, c(s') = C' in the problem instance P, and then 
over all problems in the random k-SAT ensemble with given n and m. A mean-field approximation 
with Eq. 8 gives 

$$S^{(h)}_{D,C,C'} = \sum_{d,d',\delta,c,c'} u_d^{(h)} \left( u_{d'}^{(h)} \right)^{*} e^{i \pi \rho_h (c - c')} \, V_{D,d,d',\delta}(C, C', c, c') \, S^{(h-1)}_{\delta,c,c'} \qquad (13)$$

where $V_{D,d,d',\delta}(C, C', c, c')$ is the ensemble average of the number of assignment pairs s, s' with costs 
c, c', respectively, with distance relations $d(s,s') = \delta$, $d(r,s) = d$, $d(r',s') = d'$, averaged over all 
assignment pairs r, r' with d(r,r') = D and costs C, C', respectively, as illustrated in Fig. 7. 

Eq. 13 uses $V_{D,d,d',\delta}(C, C', c, c')$, which characterizes the relevant structure of the problems and 
is independent of the search algorithm choices for $R(\lambda)$ and $T(\lambda)$. As with $V_d(C, c)$, this 4-state 
structure quantity is a sum of products of multinomials. As described in the appendix, an expansion 
similar to that described above for the average amplitude $A_c$ gives an asymptotic expansion of 
Eq. 13 for large problem sizes. Specifically, we express the behavior for S with the expansion near 
$D = 0$, $c \approx C$, $d \approx D$ of $S \propto Y^{d} X^{c}$, with $X = r e^{i\theta}$ and Y depending on the step. For $j \gg 1$, these 
values change slowly from one step to the next giving differential equations: 





= ttt r 






r\ 
= nR- 



irT Y 2 k[iF- (1-r) (1+p (-1 + Jfer)) sin(6») + Gsin(B) 

1 — p 

kYF — - — sin(6») 
l-p 
(14) 



with 

$$F = \exp\left( -\nu k \mu \left( 1 + r^2 - 2 r \cos\theta \right) \right)$$
$$G = \exp\left( \nu k \mu \left( (1 + r^2) \cos\theta - 2 r \right) \right)$$
$$B = \nu k \mu \left( -1 + r^2 \right) \sin\theta$$

and $\nu$ — 1 _ p (i_ r ^) ■ The initial conditions are $r(0) = 1$, $\theta(0) = 0$ and $Y(0) = 1$, corresponding to all 
amplitudes equal. The equations for Y and r are unchanged by adding any multiple of $2\pi$ to $\theta$. 

Figure 8: Behavior of solutions to approximations as a function of $\lambda = h/j$, with the same phase parameters 
used in Fig. ^|. Solid curves are from Eq. 14 and dashed from Eq. 12. The black curves are $r = |X|$ (solid) 
and $|Z|$ (dashed) while the light gray curves are the arguments of X (i.e., $\theta$) and Z. The thick dark gray 
curve is Y. 



5.4 Predicted Behavior Using Amplitude Variation 

Fig. 8 shows the solution of Eq. 12 and Eq. 14 for the choice of R and T of Fig. [j]. For these 
parameters Eq. 12 gives $|Z| \to 0$, predicting good performance. But Eq. 14 has $|X| = r$ remaining 
positive. For small $\lambda$, the variation among amplitudes for states with the same cost is small, so 
$S^{(h)}_{D,C,C'} \approx A_C^{(h)} \left( A_{C'}^{(h)} \right)^{*}$, corresponding to $X \approx Z$. As $\lambda$ increases, these quantities differ significantly 
so the two approximations make quite different predictions for the form of the amplitudes at the 
end of the trial, i.e., at $\lambda = 1$. 

As one quantitative evaluation of the approximation including amplitude variation, we can 
compare its prediction of the scaling of $P_{\mathrm{soln}}$ with that seen in Fig. |^. This approximation has 
$\langle |\psi_s|^2 \rangle \propto |X|^{2c}$ for states with c conflicts. Thus, assuming this expansion holds not only for 
dominant c but also extends to c = 0, i.e., solutions, the probability for a solution, as described in Eq. 15 
of the appendix, is 

$$\left( \frac{1 - p}{1 - p \left( 1 - |X|^2 \right)} \right)^{m}$$

with $p = 2^{-k}$. The solution to Eq. 14 shown in Fig. 8 has $|X| = r = 0.399$ at $\lambda = 1$, giving $\langle P_{\mathrm{soln}} \rangle \sim 
\exp(-0.0957 n)$ since m = 4.25n, and thus estimates the cost $j/\langle P_{\mathrm{soln}} \rangle$ growing as $\exp(0.0957 n)$, very 
close to the observed growth of $e^{0.10 n}$ in Fig. ||. Note the latter quantity is based on median costs 
of soluble problems while the theory is an estimate of $j/\langle P_{\mathrm{soln}} \rangle$ for all problems. 

Fig. 8 shows the contribution from the spread remains small, i.e., Y is near 1, for most of 
the steps, and then decreases. This corresponds to empirical observations of the behavior of the 
algorithm where amplitudes for states near the dominant cost have relatively little variation until 
the last few steps [pTj, as illustrated in Fig. 1. Thus Eq. 14 provides an account of this behavior. 

These observations show the relations among groups of four assignments in random SAT 
problems, used to derive Eq. 14, give a fuller account of the algorithm behavior than the simpler theory 
of the average amplitudes. The fuller theory does not, however, account for all behaviors. For example, 
as shown in Fig. ^, continuing the trial beyond step j, i.e., for $\lambda > 1$, gives small oscillations in $P_{\mathrm{soln}}$ up to 
a bit below $\lambda = 2$ followed by a drop to $P_{\mathrm{soln}} \approx 0$. On the other hand, continuing the solution of 
Eq. 14 beyond $\lambda = 1$ gives small oscillations in r even beyond $\lambda = 3$. That is, Eq. 14 fails to account 
for the drop in $P_{\mathrm{soln}}$ beyond $\lambda \approx 2$ for these parameters. Examining the amplitudes shows they 
develop multiple peaks so there is no longer a single small range of dominant costs as assumed in 
deriving Eq. 14 from Eq. 13. Nevertheless, Eq. 14 appears reasonable for describing the behavior 
over the range of most value for finding solutions, i.e., the range over which the bulk of the amplitude 
concentrates in low-cost states. 

Of particular interest is whether this approximation can also suggest improvements to the 
algorithm, i.e., the choices of the functions $R(\lambda)$ and $T(\lambda)$. As illustrated in Fig. 8, the solutions to 
Eq. 14 take on finite nonzero values after the final step, at $\lambda = 1$, for most choices of the phase 
functions. However, there are special cases in which $r(1) = 0$. 

To see what this requires, note that the derivative of r with respect to $\lambda$ in Eq. 14 is proportional to Y. 
Thus for r to decrease to zero, it is important to prevent Y from also becoming small too rapidly. 
Examining the right-hand sides of Eq. 14 shows Y decreases much more rapidly than r, when r is small, 
unless $\theta$ is near $\pi$. Thus one way to have $r(1) = 0$ is for $\theta \to \pi$ while Y remains bounded above zero. 
In this case, the 1/r term contributing to the derivative of $\theta$ in Eq. 14 becomes large. Thus if $\theta$ is to 
approach $\pi$ smoothly, the phase adjustment $R(\lambda)$ must also be large near $\lambda = 1$. In particular, the linear 
form of Eq. |?] near $\lambda = 1$ is not sufficient to allow $r \to 0$. 

Solutions of Eq. 14 with $r(1) = 0$ will not, in general, also satisfy the initial conditions $r(0) = 1$, 
$\theta(0) = 0$ and $Y(0) = 1$. Nevertheless, appropriate choices of the phase parameter functions, $R(\lambda)$ and 
$T(\lambda)$, satisfy both sets of conditions. These choices can be found numerically using parameterized 
forms for these functions and adjusting the parameters to match the required conditions. In these 
cases $r(\lambda) = O(1 - \lambda)$ near $\lambda = 1$. So for finite n, when $\lambda = 1 - O(1/j)$, i.e., the last few steps, 
we have $r = O(1/j)$. Thus this approximate analysis indicates $j/\langle P_{\mathrm{soln}} \rangle$ is polynomial in n, hence 
predicting high performance is possible when the number of steps is much greater than $\sqrt{n}$. 
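
The step from $r = O(1/j)$ to a polynomial cost can be made explicit under the assumption, used earlier in this section, that the solution probability takes the form $\left( (1-p) / (1 - p(1 - r^2)) \right)^m$ with $m = \mu n$. The following is a sketch of that step rather than a rigorous bound; the constant $b$ is introduced here only for illustration.

```latex
\begin{align*}
\ln \langle P_{\mathrm{soln}} \rangle
  &\approx m \ln \frac{1-p}{1 - p(1 - r^2)}
   = -\mu n \ln\!\left( 1 + \frac{p \, r^2}{1-p} \right)
   \approx -\frac{\mu p}{1-p} \, n \, r^2 , \\
r = \frac{b}{j}
  &\;\Longrightarrow\;
  \langle P_{\mathrm{soln}} \rangle \approx
  \exp\!\left( -\frac{\mu p \, b^2}{1-p} \, \frac{n}{j^2} \right) .
\end{align*}
```

The exponent stays bounded as n grows provided j grows at least as fast as $\sqrt{n}$, in which case $j/\langle P_{\mathrm{soln}} \rangle$ is dominated by the factor j.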

The choices for the phase parameters are not unique. The additional flexibility may be useful 
to minimize the variation in performance among different problem instances in the ensemble. More 
significantly, the need for large phase adjustments for the last few steps to have $r \to 0$ suggests the 
asymptotic character of the algorithm may change in the last steps of a trial. In particular, the 
large adjustments may mean the differential equations of Eq. 14 are no longer good approximations 
for the discrete map Eq. 13, thereby requiring a more detailed asymptotic analysis for the behavior 
in the last few steps. This observation highlights two distinct approximations: first 
replacing Eq. 8 by Eq. 13 and then estimating its asymptotic behavior by the system of differential 
equations in Eq. 14. The latter approximation depends on relatively small changes from one step to 
the next ]5l|], which no longer holds when using large phase parameters. 

Empirically, using the large R values near $\lambda = 1$ required by this analysis gives small values for 
$P_{\mathrm{soln}}$ for the small problems feasible to simulate when j n. However, when the number of steps 
j is taken quite large, e.g., $j \approx 100$ for n = 14, large R values for the last few steps do give large 
$P_{\mathrm{soln}}$, but then the cost $j/P_{\mathrm{soln}}$ is large due to the large number of steps. Thus a good test of this 
theory's predictions is beyond the range of feasible simulation. 

Nevertheless, the analysis indicates a change in behavior for the last few steps so we consider 
separate choices for $\rho_h$ and $\tau_h$ for the last few steps. That is, we numerically optimize the values 
separately for the last few steps while continuing to use the linear variation of Eq. for the remaining 
steps. For example, Fig. 9 shows the performance using separate values for the last two steps 
and reducing j/n to minimize median cost as suggested by Fig. ^. The figure shows improved 
performance with these adjustments, i.e., using a slower growth of j, e.g., j = (n°' 2 J, and different 
phase adjustments for the last two steps. An exponential fit to the new values gives the median cost 
growing as $e^{0.08 n}$, a somewhat smaller growth rate than the original heuristic. 
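
An exponential fit of this kind amounts to a straight-line fit of the logarithm of the cost against n. The sketch below shows one way to do it with ordinary least squares; the data points are synthetic, generated from an exact exponential purely to exercise the code, and are not the measured median costs.

```python
import math

def fit_exponential_growth(ns, costs):
    """Least-squares fit of log(cost) = log(c0) + g * n; returns (g, c0)."""
    ys = [math.log(c) for c in costs]
    n_mean = sum(ns) / len(ns)
    y_mean = sum(ys) / len(ys)
    sxx = sum((n - n_mean) ** 2 for n in ns)
    sxy = sum((n - n_mean) * (y - y_mean) for n, y in zip(ns, ys))
    g = sxy / sxx
    return g, math.exp(y_mean - g * n_mean)

ns = [10, 12, 14, 16, 18, 20]
costs = [5.0 * math.exp(0.08 * n) for n in ns]   # synthetic data, not measured values
g, c0 = fit_exponential_growth(ns, costs)
```

With real median costs the fitted rate would carry an uncertainty that depends on the range of n and the sample sizes, which a fit over so few small problem sizes cannot make small.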

Because the numerical optimization requires solving a sample of problems multiple times, finding 
good values for $\rho$ and $\tau$ is only possible with small samples, e.g., 50 instances with n = 16, 10 with 
n = 22 and 1 with n = 24. For n = 26, parameter optimization is not feasible so the figure shows the 
behavior using the parameters found for n = 24. Thus the resulting optimal phase values for these 
small samples are not likely to be the best possible for the ensemble as a whole. Hence the reduced 
median costs shown in Fig. 9 are upper bounds on the cost achievable by the heuristic for 
these problem sizes. A more comprehensive evaluation would optimize phase parameters separately 
for every step and use larger training samples. Such a procedure is only feasible for even smaller 
problems than shown in the figure. Nevertheless, it appears likely that linear variation in the phase 
parameters is quite good for all but the last few steps. 



6 Discussion 

As we have seen, search state properties are readily incorporated in quantum search algorithms 
through amplitude phase adjustments. Properly selected, such adjustments achieve lower overall cost 
than unstructured search, and require less coherence time for the quantum operations. On the other 
hand, the additional complexity of such algorithms precludes a simple analytic expression of their 
average cost and hence makes it difficult to identify those phase choices giving the minimum cost. 
Nevertheless, approximate techniques provide reasonably good choices and indicate the possibility 
of polynomial search cost, on average, for hard random k-SAT. The approximations also explain 
qualitative features of the algorithm behavior such as the gradual shift in amplitudes toward 
low-cost states and the increasing amplitude variance in the last few steps of a trial. Moreover, it 
appears possible to achieve good average performance with phase parameters depending only on the 
ensemble parameters n, k and m, rather than values tuned to each problem instance. 

The averaging procedure is useful because the quantum algorithm evaluates the entire search 
space and hence incorporates information from all states. By contrast any single run of a classical 
heuristic samples only a relatively few states which are unlikely to be typical of the search space as a 
whole, hence precluding theoretical analyses based on average state properties. Thus while quantum 
heuristics are difficult to evaluate empirically due to the exponential cost of their simulation on 
classical machines, they could allow a simpler, though still approximate, theoretical analysis than is 
possible for classical heuristics. 

A number of extensions are possible. First, the amplitude shift of Fig. 1 means even if a solution 
is not found after a trial, the measured state likely has relatively low cost. Thus, like local classical 
search methods such as GSAT but unlike amplitude amplification, the algorithm applies directly 
to combinatorial optimization, i.e., finding a minimal conflict state [|l7|. For example, the shift 
in amplitudes toward low-cost states is seen in satisfiability problems with no solutions and the 
traveling-salesman problem |n| . 

Second, the mean-field analysis also applies to other classes of search problems, provided the 
probabilities relating problem properties can be determined. This is possible for a variety of 
commonly studied search ensembles such as coloring random graphs. Ensembles of real-world problems 
lack analytically known probability distributions, but sampling representative instances allows 
estimating $P(c|c',d)$. Such estimates may even be useful for analytically simple ensembles, allowing 
some tuning of phase parameters for a particular problem instance. Conversely, which problem 
classes have so little correlation among search state properties that quantum algorithms are unlikely 
to be particularly useful, on average? Such classes may be useful for cryptographic applications p8[ . 

Analysis of problem structure can also indicate how the cost varies through the search space 
in giving local minima, plateaus, etc. |2j], pjj . Such information may help evaluate other types of 
quantum algorithms that rely on properties of the cost function throughout the space, such as those 
using partial assignments ^ |2^]. As another example, a continuous evolution approach pj ] depends 
on the nature of the eigenvalue spectrum of Hamiltonians encoding the problem costs and hence 
may benefit from an ensemble analysis of problem structure. 

Third, in common with amplitude amplification ||] and some classical methods [39], the growth 
of $p^{(h)}(0)$, as seen in Fig. [j], means stopping a bit before the largest probability reduces the expected 
cost. More generally, a mixture or "portfolio" of trials with somewhat different parameter values 
could give improved trade-offs between expected costs and the variation in costs seen among different 
instances 32, pp|. 

Fourth, the heuristic can readily incorporate other computationally efficient properties of the 
search states as additional arguments to the phase function $\rho$. One such property is how the 
number of conflicts in a state compares to those of its neighbors, which is used by a number of 
conventional heuristics including GSAT. Moreover, in analogy with quadratically improving 
conventional heuristics with amplitude amplification we could also evaluate a conventional heuristic, 
such as GSAT, for a fixed number of steps and use the cost of the resulting state to adjust phases 
(either instead of or in addition to the cost of the original state). In this case we would be searching 
not for a solution state directly but rather for a "good" initial state, i.e., one from which the 
conventional heuristic rapidly finds a solution. In fact, using just a few steps of GSAT with random SAT 
instances with n = 12 and 20 shows the same shift toward low-cost states as seen in Fig. [j], and the 
resulting $P_{\mathrm{soln}}$ is larger. However, for these problem sizes, the $P_{\mathrm{soln}}$ values in the original algorithm 
are sufficiently large that even if using a few steps of GSAT were able to increase $P_{\mathrm{soln}}$ to equal 1, it 
would not reduce the overall trial cost due to the additional steps involved in evaluating GSAT. 
Nevertheless, this approach may be useful for larger problem sizes and illustrates the potential trade-off 
between the cost of the procedure evaluating search state properties and the resulting probability 
for a solution, which determines the expected number of trials. In summary, introducing additional 
properties in the phase adjustment may give better performance, but increases the possible number 
of distinct parameter values. Thus numerical optimization of parameter values is likely to be more 



An interesting open question is whether this heuristic can benefit from using different parameters 
and numbers of steps for each trial, as used for amplitude amplification when the number of solutions 
is not known. As with amplitude amplification, the simulations indicate a wide range of performance 
among different instances with the same n and m, even if they have the same number of solutions. 
This approach would rely on the variation among problem instances, not addressed by ensemble 
averages. Furthermore, the series of low-cost states returned by the unsuccessful trial may also be 
useful indications of problem structure. This provides another contrast with amplitude amplification 
where unsuccessful trials simply return randomly selected nonsolution states, with no bias toward 
lower costs. 

While this discussion is encouraging, we should note its limitations. The theory does not pro- 
vide rigorous bounds on the average search cost. Moreover, even if the algorithm performs well on 
average, it has no guarantee for specific instances. Nevertheless, restricting consideration to algo- 
rithms whose behavior is analytically simple underestimates the potential of quantum computers for 
typical searches, just as is the case for conventional search algorithms. With ongoing developments 
in error correction |46[ |37j] and implementation jlO[ || |34|, |d], |J |35) , quantum machines with 
even a modest number of bits and limited coherence time could help address these issues by evalu- 
ating heuristics beyond the range of classical simulation. This will be particularly useful for more 
complicated heuristics, using additional problem properties, whose theoretical analysis is likely to 
be more difficult. Exploring their behavior will identify opportunities quantum computers have for 
using information available in combinatorial searches to significantly improve performance. 

Figure 10: Behavior of $P(c|C,d)$ for n = 20, k = 3, m = 80 and C = 3. The values for d = 0 are not 
included: $P(c|C, 0)$ is one when c = C and zero otherwise. 



A Derivation of Behavior of S 

This appendix describes the derivation of the equation for the behavior of the spread. 

A.1 Problem Structure 

The algorithm adjusts phases based on the cost associated with each state, and mixes amplitudes 
based on Hamming distance between pairs of states. Evaluating Eq. 13 requires the relation between 
distance and difference in cost. For random k-SAT, the required probability distributions are based 
on multinomial distributions, which are approximately Gaussian for large problems. 

The probability an assignment has cost C is $P(C) = \binom{m}{C} p^{C} (1 - p)^{m - C}$ where $p = 2^{-k}$ is the 
probability a single clause conflicts with a given assignment. The expected number of states with 
cost C is $v(C) = 2^n P(C)$. As one application, if the amplitudes after step h satisfy $|\psi_s|^2 \propto a^{c(s)}$ 
for some constant a, then the probability to obtain a state with c conflicts, $p^{(h)}(c)$, is proportional to 
$P(c) \, a^{c}$ giving 

$$p^{(h)}(c) = \frac{P(c) \, a^{c}}{\left( 1 - p (1 - a) \right)^{m}} \qquad (15)$$

In particular, $p^{(h)}(0)$ is the probability to obtain a solution. 
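
The normalization behind Eq. 15 can be verified numerically. The sketch below builds the binomial cost distribution, reweights it by $a^c$, and compares the resulting probability of a solution with the closed form $\left( (1-p) / (1 - p(1-a)) \right)^m$. The function names and the value of a are assumptions chosen for illustration; n, k and m are not tied to any particular experiment.

```python
import math

def cost_distribution(m, k):
    """P(C) for C = 0..m: binomial with conflict probability p = 2**-k per clause."""
    p = 2.0 ** -k
    return [math.comb(m, c) * p ** c * (1 - p) ** (m - c) for c in range(m + 1)]

def measured_cost_distribution(m, k, a):
    """Probabilities proportional to P(c) * a**c, normalized over c = 0..m."""
    weights = [prob * a ** c for c, prob in enumerate(cost_distribution(m, k))]
    total = sum(weights)
    return [w / total for w in weights]

m, k, a = 80, 3, 0.16          # a is a hypothetical per-conflict weight
p = 2.0 ** -k
dist = measured_cost_distribution(m, k, a)
closed_form_p0 = ((1 - p) / (1 - p * (1 - a))) ** m
```

The entry `dist[0]` is the probability of measuring a solution and agrees with `closed_form_p0` up to floating-point error.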

Similarly, the probability two states separated by distance d have costs C and c, respectively, is 
given by a sum of multinomials depending on the number of clauses conflicting with both states [p7| . 
The corresponding conditional probability $P(c|C, d)$ is peaked for c values close to C when $d \ll n$, 
as illustrated in Fig. 10. As n increases, the relative width of the probability distribution decreases 
as $1/\sqrt{n}$, leading to a high correlation between cost and distance for nearby states. The expected 
number of states with c conflicts at distance d from a state with C conflicts, $V_d(C, c)$, is $\binom{n}{d} P(c|C, d)$. 

Figure 11: Grouping of variables based on assigned values in assignments $r$, $r'$, $s$ and $s'$, each shown as a
horizontal box schematically indicating values assigned to each of the $n$ variables. In each assignment, the
value given in $r$ to a variable is shown as white and the opposite value as black. In this diagram, variables
are grouped according to the differences in values they are given in the four assignments. For instance, the
first group, consisting of $w_0$ variables, has those variables assigned the same value in all four assignments.
The fourth group, with $w_3$ variables, has those variables with the same values in $r$ and $r'$, but the opposite
value in both $s$ and $s'$.

The quantity $\nu_{D,d,d',\delta}(C, C', c, c')$ is the sum, over all groups of four states $r, r', s, s'$ with the
specified distance relations, of the probability $P(C, C', c, c'|r, r', s, s')$ those states have, respectively,
costs $C, C', c, c'$. For random k-SAT, this probability depends only on the way these states share
variables with the same assigned values, specified by $W = \{w_0, w_1, \ldots, w_7\}$ and illustrated in Fig. 11.
For example, $w_0$ counts the number of variables assigned the same value in all four states. These
possibilities completely specify the distances between the states, namely,

$$
\begin{aligned}
D &= d(r, r') = w_4 + w_5 + w_6 + w_7 \\
d &= d(r, s) = w_2 + w_3 + w_6 + w_7 \\
d' &= d(r', s') = w_1 + w_3 + w_4 + w_6 \\
\delta &= d(s, s') = w_1 + w_2 + w_5 + w_6
\end{aligned}
\qquad (16)
$$

For a given set of values $W$, there are $N(W) = 2^n \binom{n}{w_0, \ldots, w_7}$ corresponding choices for the four states.
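The distance relations of Eq. 16 can be checked on explicit assignments. The Python sketch below assumes one particular labeling of the eight groups, inferred from Eq. 16 rather than stated in the text: a variable falls in group $4a + 2b + c$, where $a$, $b$, $c$ indicate whether $r'$, $s$, $s'$ respectively give it a value different from $r$. The function names are hypothetical.

```python
import random
from math import factorial

def group_counts(r, rp, s, sp):
    """Count variables in each of the 8 groups, indexed by 4a + 2b + c where
    a, b, c flag a difference from r in r', s, s' (an assumed labeling)."""
    w = [0] * 8
    for x, xp, y, yp in zip(r, rp, s, sp):
        w[4 * (xp != x) + 2 * (y != x) + (yp != x)] += 1
    return w

def distances_from_groups(w):
    """Return (D, d, d', delta) from the group sizes, following Eq. 16."""
    D = w[4] + w[5] + w[6] + w[7]
    d = w[2] + w[3] + w[6] + w[7]
    d_prime = w[1] + w[3] + w[4] + w[6]
    delta = w[1] + w[2] + w[5] + w[6]
    return D, d, d_prime, delta

def hamming(u, v):
    """Hamming distance between two assignments of equal length."""
    return sum(x != y for x, y in zip(u, v))

def num_state_choices(n, w):
    """N(W) = 2^n * n! / (w_0! ... w_7!): 2^n choices for r times the number
    of ways to distribute the n variables among the groups."""
    coeff = factorial(n)
    for count in w:
        coeff //= factorial(count)
    return 2**n * coeff

if __name__ == "__main__":
    random.seed(1)
    n = 10
    r, rp, s, sp = ([random.randint(0, 1) for _ in range(n)] for _ in range(4))
    w = group_counts(r, rp, s, sp)
    print(w, distances_from_groups(w), num_state_choices(n, w))
```

For any four assignments, the tuple from `distances_from_groups` should agree with the Hamming distances computed directly between the corresponding pairs.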

Generalizing the case for two states, the probability $P(C, C', c, c'|W)$ is a multinomial sum over
the ways the clauses can be selected to conflict with different subsets of the states, constrained
to give the specified number of conflicts to each of the states. These clause selections are determined
by the principle of inclusion and exclusion p2| . Finally, $\nu_{D,d,d',\delta}(C, C', c, c')$ is the sum of
$N(W) P(C, C', c, c'|W)$ over those choices of $W$ matching the specified distances between the states.

Because of the constraints on the conflicts and distances, the resulting 4-state probability does 
not have a simple closed form. It is nevertheless readily calculated and, for large problems, is 
approximately a normal distribution. For use in the expansion described in the next section, this 
distribution is multiplied by powers and summed, which can be done directly using the multinomial 
theorem. 



A.2 Expansion for Large Problem Sizes



Eq. 13 simplifies in the limit of large $n$ using the following observations of its structure. First, for
weak mixing, i.e., when the mixing parameter $\tau$ is taken to be $O(1/n)$, the $u_d$ values in Eq. ^ decrease as $n^{-d}$ so the
main contributions are from terms with $d, d' \ll n$. Second, as illustrated in Fig. 10, nearby states
generally have about the same cost so the $c, c'$ sums in Eq. 13 are dominated by the values close to
$C, C'$, respectively. Finally, from Fig. 7, small values for $d, d'$ also require $\delta \approx D$.

Thus to evaluate Eq. 13, expand $S_{\delta,c,c'}$ as $S_{D,C,C'}\, Y^{\delta - D} X^{c - C} (X^*)^{c' - C'}$ for values of $\delta$, $c$, $c'$ close
to $D$ and the dominant $C$, $C'$. We are particularly interested in the behavior for $D$ near zero.

With this expansion, the sum over $c, c'$ in Eq. 13 is a sum of a multinomial multiplied by powers,
which is readily evaluated for given distance relations $W$ in terms of the fraction of clauses conflicting
with various subsets of the four states.

For the sums over $d, d'$, the restrictions $d, d' \ll n$ mean the variable groups shown in Fig. 11
are all much less than $n$ except possibly for the two groups contributing to neither the value of $d$
nor $d'$. From Eq. 16 these are $w_0 = n - D - w_1 - w_2 - w_3$ and $w_5 = D - w_4 - w_6 - w_7$. This
observation, combined with the contributions from the $u_d$ factors in Eq. 13, allows the $d, d'$ sums to
be approximated as exponentials.

For evaluating the probability in states with cost $C$ at step $h$ we need only $S(0, C, C)$, whose value
and change from one step to the next are determined by the behavior for $D \ll n$ and hence $C'$ close to
$C$. Furthermore, the bulk of the probability is concentrated in states with a narrow range of costs.
Thus we can focus on the behavior near the dominant C value at each step. 

Let $X = r e^{i\theta}$ with $r$ and $\theta$ real-valued. With a narrow distribution of costs, the dominant $C$
equals the average, i.e., $\sum_C C P(C) S(0, C, C) \propto \sum_C C P(C) r^{2C}$. Thus the dominant $C$ equals
$r^2 v m$ where $v = \frac{p}{1 - p(1 - r^2)}$. Hard random k-SAT problems have $m \propto n$, so significant amplitude is
in low-cost states whenever $r$ is of order $1/\sqrt{n}$. When $r \ll 1/\sqrt{n}$, the lowest cost states (i.e., the
solutions if the problem instance is soluble) have most of the amplitude.
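The expression for the dominant cost can be verified numerically. The following Python sketch, with hypothetical function names and illustrative parameter values, compares the average of $C$ under weights $P(C) r^{2C}$ with $r^2 v m$, and shows that for $m \propto n$ and $r$ of order $1/\sqrt{n}$ the dominant cost stays of order one as $n$ grows.

```python
from math import comb, sqrt

def dominant_cost_exact(m, k, r):
    """Average of C under weights P(C) r^(2C), with P(C) binomial, p = 2^-k."""
    p = 2.0 ** -k
    weights = [comb(m, C) * p**C * (1 - p) ** (m - C) * r ** (2 * C)
               for C in range(m + 1)]
    return sum(C * w for C, w in enumerate(weights)) / sum(weights)

def dominant_cost_formula(m, k, r):
    """r^2 v m with v = p / (1 - p (1 - r^2))."""
    p = 2.0 ** -k
    v = p / (1 - p * (1 - r**2))
    return r**2 * v * m

if __name__ == "__main__":
    k, ratio = 3, 4.25  # hypothetical clause-to-variable ratio m/n
    for n in (20, 40, 80):
        m = round(ratio * n)
        r = 1 / sqrt(n)
        print(n, dominant_cost_exact(m, k, r), dominant_cost_formula(m, k, r))
```

The two functions agree because weighting a binomial distribution by $r^{2C}$ gives another binomial with success probability $p r^2 / (1 - p + p r^2)$.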

Expanding around the dominant $C$ value then produces Eq. 14.